Efficient multiscale and multifont optical character recognition system based on robust feature description
Internal identifier: 000014 (Main/Exploration); previous: 000013; next: 000015
Authors: Mahmoud Soua [France]; Rostom Kachouri [France]; Mohamed Akil [France]
English descriptors: Feature Extraction; Feature Matching; Multifont; Multiscale; OCR System; SAD technique
Abstract
Optical Character Recognition (OCR) is the process of translating images of text into a machine-readable format. Generally, an OCR system is composed of binarization, segmentation and recognition stages. Given an extracted binary character, the recognition stage computes its description and decides its corresponding ASCII code. In this paper, we propose a new OCR system that aims at high-speed, multiscale and multifont character recognition. Our proposal is based essentially on robust description using a new Unified Character Descriptor (UCD). In addition, character typeface and font-size recognition is performed to choose the adequate template for a faster matching process. The OCR accuracy obtained by our proposed system is 1.5x higher than that reached by Tesseract on the LRDE dataset.
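The record's keywords list the "SAD technique" (Sum of Absolute Differences) alongside feature matching. As a rough illustration of what SAD-based template matching on binary characters can look like, here is a minimal sketch. It is not the paper's Unified Character Descriptor: the function names `sad` and `recognize`, the 3x3 toy templates and the direct pixel comparison are all assumptions made for the example.

```python
# Hypothetical sketch of nearest-template classification using the
# Sum of Absolute Differences (SAD). It compares raw binary pixels,
# not the UCD features described in the paper.

def sad(a, b):
    """Sum of absolute differences between two equal-sized binary grids."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("glyph and template must have the same shape")
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize(glyph, templates):
    """Return the label of the template with the lowest SAD score."""
    return min(templates, key=lambda label: sad(glyph, templates[label]))


# Toy 3x3 templates (illustrative only, not taken from the paper).
TEMPLATES = {
    "I": [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]],
    "L": [[1, 0, 0],
          [1, 0, 0],
          [1, 1, 1]],
    "T": [[1, 1, 1],
          [0, 1, 0],
          [0, 1, 0]],
}

# An "L" with one flipped pixel in the bottom-right corner.
noisy_l = [[1, 0, 0],
           [1, 0, 0],
           [1, 1, 0]]

print(recognize(noisy_l, TEMPLATES))  # prints "L"
```

In a multifont, multiscale setting such as the one the abstract describes, the template set would presumably be narrowed first (for example by recognized typeface and font size) so that fewer SAD comparisons are needed per character.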
Url: https://hal-upec-upem.archives-ouvertes.fr/hal-01309987
DOI: 10.1109/IPTA.2015.7367214
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream Hal, to step Corpus: 000043
- to stream Hal, to step Curation: 000043
- to stream Hal, to step Checkpoint: 000002
- to stream Main, to step Merge: 000014
- to stream Main, to step Curation: 000014
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Efficient multiscale and multifont optical character recognition system based on robust feature description</title>
<author><name sortKey="Soua, Mahmoud" sort="Soua, Mahmoud" uniqKey="Soua M" first="Mahmoud" last="Soua">Mahmoud Soua</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-3210" status="VALID"><idno type="RNSR">200212717U</idno>
<orgName>Laboratoire d'Informatique Gaspard-Monge</orgName>
<orgName type="acronym">LIGM</orgName>
<desc><address><addrLine>Université de Paris-Est - Marne-la-Vallée, Cité Descartes, Bâtiment Copernic, 5 bd Descartes, 77454 Marne-la-Vallée Cedex 2, Inst Gaspard Monge</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://ligm.u-pem.fr</ref>
</desc>
<listRelation><relation active="#struct-301243" type="direct"></relation>
<relation active="#struct-301545" type="direct"></relation>
<relation active="#struct-302085" type="direct"></relation>
<relation active="#struct-304949" type="direct"></relation>
<relation name="UMR8049" active="#struct-441569" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-301243" type="direct"><org type="institution" xml:id="struct-301243" status="VALID"><orgName>Université Paris-Est Marne-la-Vallée</orgName>
<orgName type="acronym">UPEM</orgName>
<desc><address><addrLine>5 boulevard Descartes - Champs-sur-Marne - 77454 Marne-la-Vallée Cedex2 </addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.u-pem.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301545" type="direct"><org type="institution" xml:id="struct-301545" status="OLD"><orgName>École des Ponts ParisTech (ENPC)</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-302085" type="direct"><org type="institution" xml:id="struct-302085" status="VALID"><orgName>Fédération de Recherche Bézout</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-304949" type="direct"><org type="institution" xml:id="struct-304949" status="INCOMING"><orgName>ESIEE</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="UMR8049" active="#struct-441569" type="direct"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Kachouri, Rostom" sort="Kachouri, Rostom" uniqKey="Kachouri R" first="Rostom" last="Kachouri">Rostom Kachouri</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-3210" status="VALID"><idno type="RNSR">200212717U</idno>
<orgName>Laboratoire d'Informatique Gaspard-Monge</orgName>
<orgName type="acronym">LIGM</orgName>
<desc><address><addrLine>Université de Paris-Est - Marne-la-Vallée, Cité Descartes, Bâtiment Copernic, 5 bd Descartes, 77454 Marne-la-Vallée Cedex 2, Inst Gaspard Monge</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://ligm.u-pem.fr</ref>
</desc>
<listRelation><relation active="#struct-301243" type="direct"></relation>
<relation active="#struct-301545" type="direct"></relation>
<relation active="#struct-302085" type="direct"></relation>
<relation active="#struct-304949" type="direct"></relation>
<relation name="UMR8049" active="#struct-441569" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-301243" type="direct"><org type="institution" xml:id="struct-301243" status="VALID"><orgName>Université Paris-Est Marne-la-Vallée</orgName>
<orgName type="acronym">UPEM</orgName>
<desc><address><addrLine>5 boulevard Descartes - Champs-sur-Marne - 77454 Marne-la-Vallée Cedex2 </addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.u-pem.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301545" type="direct"><org type="institution" xml:id="struct-301545" status="OLD"><orgName>École des Ponts ParisTech (ENPC)</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-302085" type="direct"><org type="institution" xml:id="struct-302085" status="VALID"><orgName>Fédération de Recherche Bézout</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-304949" type="direct"><org type="institution" xml:id="struct-304949" status="INCOMING"><orgName>ESIEE</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="UMR8049" active="#struct-441569" type="direct"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Akil, Mohamed" sort="Akil, Mohamed" uniqKey="Akil M" first="Mohamed" last="Akil">Mohamed Akil</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-3210" status="VALID"><idno type="RNSR">200212717U</idno>
<orgName>Laboratoire d'Informatique Gaspard-Monge</orgName>
<orgName type="acronym">LIGM</orgName>
<desc><address><addrLine>Université de Paris-Est - Marne-la-Vallée, Cité Descartes, Bâtiment Copernic, 5 bd Descartes, 77454 Marne-la-Vallée Cedex 2, Inst Gaspard Monge</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://ligm.u-pem.fr</ref>
</desc>
<listRelation><relation active="#struct-301243" type="direct"></relation>
<relation active="#struct-301545" type="direct"></relation>
<relation active="#struct-302085" type="direct"></relation>
<relation active="#struct-304949" type="direct"></relation>
<relation name="UMR8049" active="#struct-441569" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-301243" type="direct"><org type="institution" xml:id="struct-301243" status="VALID"><orgName>Université Paris-Est Marne-la-Vallée</orgName>
<orgName type="acronym">UPEM</orgName>
<desc><address><addrLine>5 boulevard Descartes - Champs-sur-Marne - 77454 Marne-la-Vallée Cedex2 </addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.u-pem.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301545" type="direct"><org type="institution" xml:id="struct-301545" status="OLD"><orgName>École des Ponts ParisTech (ENPC)</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-302085" type="direct"><org type="institution" xml:id="struct-302085" status="VALID"><orgName>Fédération de Recherche Bézout</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-304949" type="direct"><org type="institution" xml:id="struct-304949" status="INCOMING"><orgName>ESIEE</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="UMR8049" active="#struct-441569" type="direct"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:hal-01309987</idno>
<idno type="halId">hal-01309987</idno>
<idno type="halUri">https://hal-upec-upem.archives-ouvertes.fr/hal-01309987</idno>
<idno type="url">https://hal-upec-upem.archives-ouvertes.fr/hal-01309987</idno>
<idno type="doi">10.1109/IPTA.2015.7367214</idno>
<date when="2015-11-10">2015-11-10</date>
<idno type="wicri:Area/Hal/Corpus">000043</idno>
<idno type="wicri:Area/Hal/Curation">000043</idno>
<idno type="wicri:Area/Hal/Checkpoint">000002</idno>
<idno type="wicri:Area/Main/Merge">000014</idno>
<idno type="wicri:Area/Main/Curation">000014</idno>
<idno type="wicri:Area/Main/Exploration">000014</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Efficient multiscale and multifont optical character recognition system based on robust feature description</title>
<author><name sortKey="Soua, Mahmoud" sort="Soua, Mahmoud" uniqKey="Soua M" first="Mahmoud" last="Soua">Mahmoud Soua</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-3210" status="VALID"><idno type="RNSR">200212717U</idno>
<orgName>Laboratoire d'Informatique Gaspard-Monge</orgName>
<orgName type="acronym">LIGM</orgName>
<desc><address><addrLine>Université de Paris-Est - Marne-la-Vallée, Cité Descartes, Bâtiment Copernic, 5 bd Descartes, 77454 Marne-la-Vallée Cedex 2, Inst Gaspard Monge</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://ligm.u-pem.fr</ref>
</desc>
<listRelation><relation active="#struct-301243" type="direct"></relation>
<relation active="#struct-301545" type="direct"></relation>
<relation active="#struct-302085" type="direct"></relation>
<relation active="#struct-304949" type="direct"></relation>
<relation name="UMR8049" active="#struct-441569" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-301243" type="direct"><org type="institution" xml:id="struct-301243" status="VALID"><orgName>Université Paris-Est Marne-la-Vallée</orgName>
<orgName type="acronym">UPEM</orgName>
<desc><address><addrLine>5 boulevard Descartes - Champs-sur-Marne - 77454 Marne-la-Vallée Cedex2 </addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.u-pem.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301545" type="direct"><org type="institution" xml:id="struct-301545" status="OLD"><orgName>École des Ponts ParisTech (ENPC)</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-302085" type="direct"><org type="institution" xml:id="struct-302085" status="VALID"><orgName>Fédération de Recherche Bézout</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-304949" type="direct"><org type="institution" xml:id="struct-304949" status="INCOMING"><orgName>ESIEE</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="UMR8049" active="#struct-441569" type="direct"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Kachouri, Rostom" sort="Kachouri, Rostom" uniqKey="Kachouri R" first="Rostom" last="Kachouri">Rostom Kachouri</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-3210" status="VALID"><idno type="RNSR">200212717U</idno>
<orgName>Laboratoire d'Informatique Gaspard-Monge</orgName>
<orgName type="acronym">LIGM</orgName>
<desc><address><addrLine>Université de Paris-Est - Marne-la-Vallée, Cité Descartes, Bâtiment Copernic, 5 bd Descartes, 77454 Marne-la-Vallée Cedex 2, Inst Gaspard Monge</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://ligm.u-pem.fr</ref>
</desc>
<listRelation><relation active="#struct-301243" type="direct"></relation>
<relation active="#struct-301545" type="direct"></relation>
<relation active="#struct-302085" type="direct"></relation>
<relation active="#struct-304949" type="direct"></relation>
<relation name="UMR8049" active="#struct-441569" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-301243" type="direct"><org type="institution" xml:id="struct-301243" status="VALID"><orgName>Université Paris-Est Marne-la-Vallée</orgName>
<orgName type="acronym">UPEM</orgName>
<desc><address><addrLine>5 boulevard Descartes - Champs-sur-Marne - 77454 Marne-la-Vallée Cedex2 </addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.u-pem.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301545" type="direct"><org type="institution" xml:id="struct-301545" status="OLD"><orgName>École des Ponts ParisTech (ENPC)</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-302085" type="direct"><org type="institution" xml:id="struct-302085" status="VALID"><orgName>Fédération de Recherche Bézout</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-304949" type="direct"><org type="institution" xml:id="struct-304949" status="INCOMING"><orgName>ESIEE</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="UMR8049" active="#struct-441569" type="direct"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Akil, Mohamed" sort="Akil, Mohamed" uniqKey="Akil M" first="Mohamed" last="Akil">Mohamed Akil</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-3210" status="VALID"><idno type="RNSR">200212717U</idno>
<orgName>Laboratoire d'Informatique Gaspard-Monge</orgName>
<orgName type="acronym">LIGM</orgName>
<desc><address><addrLine>Université de Paris-Est - Marne-la-Vallée, Cité Descartes, Bâtiment Copernic, 5 bd Descartes, 77454 Marne-la-Vallée Cedex 2, Inst Gaspard Monge</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://ligm.u-pem.fr</ref>
</desc>
<listRelation><relation active="#struct-301243" type="direct"></relation>
<relation active="#struct-301545" type="direct"></relation>
<relation active="#struct-302085" type="direct"></relation>
<relation active="#struct-304949" type="direct"></relation>
<relation name="UMR8049" active="#struct-441569" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-301243" type="direct"><org type="institution" xml:id="struct-301243" status="VALID"><orgName>Université Paris-Est Marne-la-Vallée</orgName>
<orgName type="acronym">UPEM</orgName>
<desc><address><addrLine>5 boulevard Descartes - Champs-sur-Marne - 77454 Marne-la-Vallée Cedex2 </addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.u-pem.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301545" type="direct"><org type="institution" xml:id="struct-301545" status="OLD"><orgName>École des Ponts ParisTech (ENPC)</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-302085" type="direct"><org type="institution" xml:id="struct-302085" status="VALID"><orgName>Fédération de Recherche Bézout</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-304949" type="direct"><org type="institution" xml:id="struct-304949" status="INCOMING"><orgName>ESIEE</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="UMR8049" active="#struct-441569" type="direct"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
</analytic>
<idno type="DOI">10.1109/IPTA.2015.7367214</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="mix" xml:lang="en"><term>Feature Extraction</term>
<term>Feature Matching</term>
<term>Multifont</term>
<term>Multiscale</term>
<term>OCR System</term>
<term>SAD technique</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Optical Character Recognition (OCR) is the process of translating images of text into a machine-readable format. Generally, an OCR system is composed of binarization, segmentation and recognition stages. Given an extracted binary character, the recognition stage computes its description and decides its corresponding ASCII code. In this paper, we propose a new OCR system that aims at high-speed, multiscale and multifont character recognition. Our proposal is based essentially on robust description using a new Unified Character Descriptor (UCD). In addition, character typeface and font-size recognition is performed to choose the adequate template for a faster matching process. The OCR accuracy obtained by our proposed system is 1.5x higher than that reached by Tesseract on the LRDE dataset.</div>
</front>
</TEI>
<affiliations><list><country><li>France</li>
</country>
</list>
<tree><country name="France"><noRegion><name sortKey="Soua, Mahmoud" sort="Soua, Mahmoud" uniqKey="Soua M" first="Mahmoud" last="Soua">Mahmoud Soua</name>
</noRegion>
<name sortKey="Akil, Mohamed" sort="Akil, Mohamed" uniqKey="Akil M" first="Mohamed" last="Akil">Mohamed Akil</name>
<name sortKey="Kachouri, Rostom" sort="Kachouri, Rostom" uniqKey="Kachouri R" first="Rostom" last="Kachouri">Rostom Kachouri</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000014 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000014 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= Hal:hal-01309987 |texte= Efficient multiscale and multifont optical character recognition system based on robust feature description }}
This area was generated with Dilib version V0.6.32.